Analyzing G25 coordinates of ancient DNA samples using graphs

In [1]:
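# Data import and wrangling
require(rio)
require(tidyverse)
# Plot layout
require(cowplot)
require(gridExtra)
# Density-based clustering
require(dbscan)
# Graph construction and plotting
require(igraph)
require(tidygraph)
require(ggraph)
# Colour palettes
require(pals)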

Load the data for modern and ancient G25 population averages

In [2]:
avgs = as_tibble(import("../Genetics/G25/Data/TXT/Global25_PCA_pop_averages_scaled.txt"))
#avgs %>% sample_n(10)
modavgs = as_tibble(import("../Genetics/G25/Data/TXT/Global25_PCA_modern_pop_averages_scaled.txt"))
#modavgs %>% sample_n(10)

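Optionally, filter the datasets down. The next cells list the labels in each dataset so that the filters can be written as prefix patterns.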

In [3]:
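# Prefix before the first underscore; labels that do not match yield NA.
# unique() has no `sort` argument (it was silently ignored), so the order follows the file.
str_extract(avgs$V1,"^\\w+?(?=_)") %>% unique()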
  1. 'ARG'
  2. 'ARM'
  3. 'AUS'
  4. 'AUT'
  5. 'AZE'
  6. 'Baltic'
  7. 'BEL'
  8. 'Bell'
  9. 'BGR'
  10. 'BHS'
  11. 'BLZ'
  12. 'BRA'
  13. 'BWA'
  14. 'CAN'
  15. 'Canary'
  16. 'Channel'
  17. 'CHE'
  18. 'CHL'
  19. 'CHN'
  20. 'CMR'
  21. 'COG'
  22. 'Corded'
  23. 'CUW'
  24. 'CZE'
  25. 'DEU'
  26. 'DNK'
  27. 'DOM'
  28. 'EGY'
  29. 'England'
  30. 'ETH'
  31. 'FIN'
  32. 'FRA'
  33. 'GEO'
  34. 'Gepidian'
  35. 'GRC'
  36. 'Greater'
  37. 'GUM'
  38. 'HRV'
  39. 'HTI'
  40. 'HUN'
  41. 'Hun'
  42. 'Iberia'
  43. 'IDN'
  44. 'IND'
  45. 'IRL'
  46. 'IRN'
  47. 'ISL'
  48. 'Isle'
  49. 'ITA'
  50. 'JPN'
  51. 'KAZ'
  52. 'KEN'
  53. 'KGZ'
  54. 'KOR'
  55. NA
  56. 'LAO'
  57. 'Levant'
  58. 'LUX'
  59. 'MAR'
  60. 'MDA'
  61. 'MEX'
  62. 'MKD'
  63. 'MNG'
  64. 'MWI'
  65. 'MYS'
  66. 'NLD'
  67. 'NOR'
  68. 'NPL'
  69. 'Orkney'
  70. 'Ostrogothic'
  71. 'PAK'
  72. 'PAN'
  73. 'PER'
  74. 'POL'
  75. 'PYF'
  76. 'ROU'
  77. 'RUS'
  78. 'Saka'
  79. 'Sargat'
  80. 'Sarmatian'
  81. 'Scotland'
  82. 'Scythian'
  83. 'SDN'
  84. 'SRB'
  85. 'SVK'
  86. 'SVN'
  87. 'SWE'
  88. 'SYR'
  89. 'THA'
  90. 'TJK'
  91. 'TKM'
  92. 'TON'
  93. 'TUR'
  94. 'TWN'
  95. 'TZA'
  96. 'UGA'
  97. 'UKR'
  98. 'USA'
  99. 'UZB'
  100. 'VEN'
  101. 'VK2020'
  102. 'VNM'
  103. 'VUT'
  104. 'Wales'
  105. 'Yamnaya'
  106. 'ZAF'
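The NA entry in the list above comes from labels that the pattern cannot match. A minimal sketch of how the extraction behaves, assuming `stringr` is available; three labels are taken from the dataset and `Foo-Bar_X` is a made-up label used only to trigger the NA case:

```r
library(stringr)

# Three labels from the dataset plus one made-up label ("Foo-Bar_X") that has
# a hyphen before its first underscore.
labels <- c("ARM_LBA", "Bell_Beaker_CZE", "RUS_Volga-Kama_N", "Foo-Bar_X")

# The lazy \w+? stops at the first position that is followed by "_".
# A non-word character before the first "_" makes the match fail, giving NA.
prefixes <- str_extract(labels, "^\\w+?(?=_)")
prefixes

# unique() keeps first-appearance order; sort() drops NA unless na.last is set.
sorted_prefixes <- sort(unique(prefixes), na.last = TRUE)
sorted_prefixes
```

If an alphabetical listing is wanted, piping into `sort()` after `unique()` is one option; whether NA should be kept is a judgment call.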
In [4]:
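# Whole label, provided it consists only of word characters; other labels yield NA.
# unique() has no `sort` argument (it was silently ignored), so the order follows the file.
str_extract(modavgs$V1,"^\\w+?$") %>% unique()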
  1. 'Abazin'
  2. 'Abkhasian'
  3. 'Abkhasian_Gudauta'
  4. 'Adygei'
  5. 'Aeta'
  6. 'Afrikaner'
  7. 'Agta'
  8. 'Ahiska'
  9. 'Akha'
  10. 'Akhvakh'
  11. 'Alawite'
  12. 'Albanian'
  13. 'Alevi_Dersim'
  14. 'Algerian'
  15. 'Altaian'
  16. 'Amerindian_North'
  17. 'Ami'
  18. 'Andian_A'
  19. 'Andian_B'
  20. 'Arain'
  21. 'Armenian_Aintab'
  22. 'Armenian_Ararat'
  23. 'Armenian_Artsakh'
  24. 'Armenian_Erzurum'
  25. 'Armenian_Gesaria'
  26. 'Armenian_Gurin'
  27. 'Armenian_Hemsheni'
  28. 'Armenian_Parspatunik'
  29. 'Armenian_Syunik'
  30. 'Armenian_Urfa'
  31. 'Arora'
  32. 'Ashkenazi_Belarussia'
  33. 'Ashkenazi_Germany'
  34. 'Ashkenazi_Lithuania'
  35. 'Ashkenazi_Poland'
  36. 'Ashkenazi_Russia'
  37. 'Ashkenazi_Ukraine'
  38. 'Assyrian'
  39. 'Asur'
  40. 'Atayal'
  41. 'Australian'
  42. 'Austrian'
  43. 'Avar'
  44. 'Awan'
  45. 'Aymara'
  46. 'Azerbaijani_Dagestan'
  47. 'Azerbaijani_Iran'
  48. 'Azerbaijani_Republic'
  49. 'Azerbaijani_Republic_Gabala'
  50. 'Azerbaijani_Republic_Shaki'
  51. 'Azerbaijani_Turkey'
  52. 'Bagvalin'
  53. 'Bagvalin_o'
  54. 'Bahun'
  55. 'Bahun_o'
  56. 'Bai'
  57. 'Baiku_Yao_Guizhou'
  58. 'Bajo'
  59. 'Baka'
  60. 'Bakola'
  61. 'Balija'
  62. 'Balkar'
  63. 'Balochi'
  64. 'Balti'
  65. 'Balti_o'
  66. 'Baniya_Gujarat'
  67. 'Baniya_Punjab'
  68. 'Baniya_Uttar_Pradesh_Gupta'
  69. 'Bantu_Kenya'
  70. NA
  71. 'Baoan'
  72. 'Bashkir'
  73. 'Basque_Araba'
  74. 'Basque_Baztan'
  75. 'Basque_Biscay'
  76. 'Basque_French'
  77. 'Basque_Gipuzkoa'
  78. 'Basque_Gipuzkoa_Southwest'
  79. 'Basque_Lower_Navarre'
  80. 'Basque_Navarre_Center'
  81. 'Basque_Navarre_North'
  82. 'Basque_Roncal'
  83. 'Basque_Soule'
  84. 'Basque_Spanish'
  85. 'Batak'
  86. 'BedouinA'
  87. 'BedouinB'
  88. 'Bedzan'
  89. 'Belarusian'
  90. 'BelgianA'
  91. 'BelgianB'
  92. 'BelgianC'
  93. 'Bengali_Bangladesh'
  94. 'Bengali_Bangladesh_SouthEast'
  95. 'Bengali_Bangladesh_Sylhet'
  96. 'Berber_Algeria'
  97. 'Berber_MAR_ERR'
  98. 'Berber_MAR_TIZ'
  99. 'Berber_Tunisia_Chen'
  100. 'Berber_Tunisia_Sen'
  101. 'Besermyan'
  102. 'Bhumihar_Bihar'
  103. 'Bhumij'
  104. 'Biaka'
  105. 'Birhor'
  106. 'Bitonga'
  107. 'Blang'
  108. 'Bolivian_Cochabamba'
  109. 'Bolivian_LaPaz'
  110. 'Bolivian_Pando'
  111. 'Bonan'
  112. 'Bonda'
  113. 'Bosnian'
  114. 'Brahmin_Gujarat'
  115. 'Brahmin_Gujarat_Audichya'
  116. 'Brahmin_Gujarat_Bardai'
  117. 'Brahmin_Gujarat_Nagar'
  118. 'Brahmin_Gujarat_o'
  119. 'Brahmin_Himachal_Pradesh_West'
  120. 'Brahmin_Jammu_Dogra'
  121. 'Brahmin_Karnataka_Tulu'
  122. 'Brahmin_Kerala_Nambudiri'
  123. 'Brahmin_Konkani_Catholic'
  124. 'Brahmin_Kumaon'
  125. 'Brahmin_Manipuri'
  126. 'Brahmin_Rajasthan'
  127. 'Brahmin_Tamil_Nadu'
  128. 'Brahmin_Tamil_Nadu_Iyengar'
  129. 'Brahmin_Tamil_Nadu_Iyer'
  130. 'Brahmin_Telugu'
  131. 'Brahmin_Telugu_Niyogi'
  132. 'Brahmin_Telugu_Vaidiki'
  133. 'Brahmin_UP_Awadh_Saryupareen'
  134. 'Brahmin_UP_Kanyakubja'
  135. 'Brahmin_UP_Lucknow'
  136. 'Brahmin_UP_Lucknow_o'
  137. 'Brahmin_Uttar_Pradesh_East'
  138. 'Brahmin_Uttar_Pradesh_East_o'
  139. 'Brahmin_West_Bengal'
  140. 'Brahui'
  141. 'Bukharian_Jew'
  142. 'Bulala'
  143. 'Bulgarian'
  144. 'Bunt_Kerala'
  145. 'Burmese'
  146. 'Burusho'
  147. 'Buryat'
  148. 'Cachi'
  149. 'Cambodian'
  150. 'Cameroon_Aghem'
  151. 'Cameroon_Bafut'
  152. 'Cameroon_Bakoko'
  153. 'Cameroon_Bangwa'
  154. 'Cameroon_Mbo'
  155. 'Chamalin'
  156. 'Chamar_Uttar_Pradesh'
  157. 'Chamar_Uttar_Pradesh_o'
  158. 'Changana'
  159. 'Changshan_Yao_Guizhou'
  160. 'Chechen'
  161. 'Chenchu'
  162. 'Cherkes'
  163. 'Chipewyan'
  164. 'Chopi'
  165. 'Chukchi'
  166. 'Chuvash'
  167. 'Circassian'
  168. 'Cochin_Jew_A'
  169. 'Cochin_Jew_B'
  170. 'Colla'
  171. 'Cossack_Kuban'
  172. 'Cossack_Ukrainian'
  173. 'Cree'
  174. 'Croatian'
  175. 'Cypriot'
  176. 'Czech'
  177. 'Dai'
  178. 'Damai'
  179. 'Danish'
  180. 'Darginian'
  181. 'Datog'
  182. 'Daur'
  183. 'Dharkar'
  184. 'Dinka'
  185. 'Dolgan'
  186. 'Dong_Guizhou'
  187. 'Dong_Hunan'
  188. 'Dongxiang'
  189. 'Druze'
  190. 'Dungan'
  191. 'Dusadh'
  192. 'Dusun'
  193. 'Dutch'
  194. 'Egyptian'
  195. 'Elmolo'
  196. 'EmiratiA'
  197. 'EmiratiB'
  198. 'EmiratiC'
  199. 'English'
  200. 'English_Cornwall'
  201. ⋯
  202. 'Russian_Pinezhsky'
  203. 'Russian_Pskov'
  204. 'Russian_Ryazan'
  205. 'Russian_Smolensk'
  206. 'Russian_Tver'
  207. 'Russian_Voronez'
  208. 'Russian_Yaroslavl'
  209. 'Saami'
  210. 'Saami_Kola'
  211. 'Saharawi'
  212. 'Sakha'
  213. 'Sakilli'
  214. 'Salar'
  215. 'Saliya_Kerala'
  216. 'Samaritan'
  217. 'Sandawe'
  218. 'Santhal'
  219. 'Sardinian'
  220. 'Satnami_Chhattisgarh'
  221. 'Saudi'
  222. 'SaudiA'
  223. 'SaudiB'
  224. 'Scottish'
  225. 'Selkup'
  226. 'Sena'
  227. 'Sengwer'
  228. 'Sephardic_Jew'
  229. 'Sephardic_Jew_o'
  230. 'Serbian'
  231. 'She'
  232. 'Sherpa'
  233. 'Shetlandic'
  234. 'Shor'
  235. 'Shor_Khakassia'
  236. 'Shor_Mountain'
  237. 'Sicilian_East'
  238. 'Sicilian_West'
  239. 'Sindhi'
  240. 'Sindhi_o'
  241. 'Slovakian'
  242. 'Slovenian'
  243. 'Somali'
  244. 'Somali_Kenya'
  245. 'Sorb_Niederlausitz'
  246. 'Spanish_Alacant'
  247. 'Spanish_Andalucia'
  248. 'Spanish_Aragon'
  249. 'Spanish_Aragon_North'
  250. 'Spanish_Asturias'
  251. 'Spanish_Baleares'
  252. 'Spanish_Barcelones'
  253. 'Spanish_Biscay'
  254. 'Spanish_Burgos'
  255. 'Spanish_Camp_de_Tarragona'
  256. 'Spanish_Canarias'
  257. 'Spanish_Cantabria'
  258. 'Spanish_Castello'
  259. 'Spanish_Castilla_La_Mancha'
  260. 'Spanish_Castilla_Y_Leon'
  261. 'Spanish_Cataluna'
  262. 'Spanish_Catalunya_Central'
  263. 'Spanish_Eivissa'
  264. 'Spanish_Extremadura'
  265. 'Spanish_Galicia'
  266. 'Spanish_Girona'
  267. 'Spanish_La_Rioja'
  268. 'Spanish_Lleida'
  269. 'Spanish_Mallorca'
  270. 'Spanish_Menorca'
  271. 'Spanish_Murcia'
  272. 'Spanish_Navarra'
  273. 'Spanish_Pais_Vasco'
  274. 'Spanish_Penedes'
  275. 'Spanish_Pirineu'
  276. 'Spanish_Soria'
  277. 'Spanish_Valencia'
  278. 'Sri_Lankan'
  279. 'Sudanese'
  280. 'Surui'
  281. 'Swedish'
  282. 'Swiss_French'
  283. 'Swiss_German'
  284. 'Swiss_Italian'
  285. 'Syed_Uttar_Pradesh_West'
  286. 'Syrian'
  287. 'Syrian_Jew'
  288. 'Tabasaran'
  289. 'Tai_Lue'
  290. 'Tajik_Ayni'
  291. 'Tajik_Hisor'
  292. 'Tajik_Kulob'
  293. 'Tajik_Yaghnobi'
  294. 'Talysh_Azerbaijan'
  295. 'Tamang'
  296. 'Tamil_Sri_Lanka'
  297. 'Tarkhan_Muslim'
  298. 'Tat_Azerbaijan'
  299. 'Tat_Dagestan_Dzhalgan'
  300. 'Tat_Dagestan_Nyugdi'
  301. 'Tatar_Crimean_steppe'
  302. 'Tatar_Kazan'
  303. 'Tatar_Lipka'
  304. 'Tatar_Mishar'
  305. 'Tatar_Siberian'
  306. 'Tatar_Siberian_Zabolotniye'
  307. 'Telugu'
  308. 'Thai'
  309. 'Tharu'
  310. 'Tharu_o1'
  311. 'Tharu_o2'
  312. 'Thiyya'
  313. 'Tibetan_Chamdo'
  314. 'Tibetan_Gangcha'
  315. 'Tibetan_Gannan'
  316. 'Tibetan_Lhasa'
  317. 'Tibetan_Nagqu'
  318. 'Tibetan_Shannan'
  319. 'Tibetan_Shigatse'
  320. 'Tibetan_Xinlong'
  321. 'Tibetan_Xunhua'
  322. 'Tibetan_Yajiang'
  323. 'Tibetan_Yunnan'
  324. 'Tikar_South'
  325. 'Tindal'
  326. 'Tlingit'
  327. 'Todzin'
  328. 'Tripuri'
  329. 'Tsez_A'
  330. 'Tsez_B'
  331. 'Tswa'
  332. 'Tu'
  333. 'Tubalar'
  334. 'Tujia'
  335. 'Tunisian'
  336. 'Tunisian_Berber_Matmata'
  337. 'Tunisian_Berber_Tamezret'
  338. 'Tunisian_Berber_Zraoua'
  339. 'Tunisian_Douz'
  340. 'Tunisian_Jew'
  341. 'Tunisian_Rbaya'
  342. 'Turkish_Antalya'
  343. 'Turkish_Aydin'
  344. 'Turkish_Balikesir'
  345. 'Turkish_Deliorman'
  346. 'Turkish_Denizli'
  347. 'Turkish_Erzurum'
  348. 'Turkish_Giresun'
  349. 'Turkish_Kayseri'
  350. 'Turkish_Konya'
  351. 'Turkish_Nevsehir'
  352. 'Turkish_Rumeli'
  353. 'Turkish_Sivas'
  354. 'Turkish_Trabzon'
  355. 'Turkmen'
  356. 'Turkmen_Uzbekistan'
  357. 'Tuvinian'
  358. 'Tyagi'
  359. 'Udi'
  360. 'Udmurt'
  361. 'Ukrainian_Chernihiv'
  362. 'Ukrainian_Dnipro'
  363. 'Ukrainian_Lviv'
  364. 'Ukrainian_Rivne'
  365. 'Ukrainian_Sumy'
  366. 'Ukrainian_Zakarpattia'
  367. 'Ukrainian_Zhytomyr'
  368. 'Ukrainian_Zhytomyr_o'
  369. 'Ulchi'
  370. 'Umbundu'
  371. 'Uttar_Pradesh_Scheduled_Castes'
  372. 'Uygur'
  373. 'Uzbek'
  374. 'Vaniya_Kerala'
  375. 'Velama'
  376. 'Vellalar'
  377. 'Vepsian'
  378. 'Vishwakarma_Kerala'
  379. 'Vizayan'
  380. 'Wa'
  381. 'Welsh'
  382. 'Wichi'
  383. 'Xibo'
  384. 'Yadav_Telugu'
  385. 'Yakut'
  386. 'Yao'
  387. 'Yemenite_Al_Bayda'
  388. 'Yemenite_Al_Jawf'
  389. 'Yemenite_Amran'
  390. 'Yemenite_Dhamar'
  391. 'Yemenite_Jew'
  392. 'Yemenite_Mahra'
  393. 'Yi'
  394. 'Yoruba'
  395. 'Yugur'
  396. 'Yukagir_Forest'
  397. 'Yukagir_Tundra'
  398. 'Yukpa'
  399. 'Yuku'
  400. 'Zapotec'
  401. 'Zhuang'
In [25]:
# Collapse the labels into one regex alternation. Note that NA is pasted as the literal string "NA".
str_extract(modavgs$V1,"^\\w+?$") %>% unique() %>% as.character() %>% paste0(collapse = "|")
'Abazin|Abkhasian|Abkhasian_Gudauta|Adygei|Aeta|Afrikaner|Agta|Ahiska|Akha|Akhvakh|Alawite|Albanian|Alevi_Dersim|Algerian|Altaian|Amerindian_North|Ami|Andian_A|Andian_B|Arain|Armenian_Aintab|Armenian_Ararat|Armenian_Artsakh|Armenian_Erzurum|Armenian_Gesaria|Armenian_Gurin|Armenian_Hemsheni|Armenian_Parspatunik|Armenian_Syunik|Armenian_Urfa|Arora|Ashkenazi_Belarussia|Ashkenazi_Germany|Ashkenazi_Lithuania|Ashkenazi_Poland|Ashkenazi_Russia|Ashkenazi_Ukraine|Assyrian|Asur|Atayal|Australian|Austrian|Avar|Awan|Aymara|Azerbaijani_Dagestan|Azerbaijani_Iran|Azerbaijani_Republic|Azerbaijani_Republic_Gabala|Azerbaijani_Republic_Shaki|Azerbaijani_Turkey|Bagvalin|Bagvalin_o|Bahun|Bahun_o|Bai|Baiku_Yao_Guizhou|Bajo|Baka|Bakola|Balija|Balkar|Balochi|Balti|Balti_o|Baniya_Gujarat|Baniya_Punjab|Baniya_Uttar_Pradesh_Gupta|Bantu_Kenya|NA|Baoan|Bashkir|Basque_Araba|Basque_Baztan|Basque_Biscay|Basque_French|Basque_Gipuzkoa|Basque_Gipuzkoa_Southwest|Basque_Lower_Navarre|Basque_Navarre_Center|Basque_Navarre_North|Basque_Roncal|Basque_Soule|Basque_Spanish|Batak|BedouinA|BedouinB|Bedzan|Belarusian|BelgianA|BelgianB|BelgianC|Bengali_Bangladesh|Bengali_Bangladesh_SouthEast|Bengali_Bangladesh_Sylhet|Berber_Algeria|Berber_MAR_ERR|Berber_MAR_TIZ|Berber_Tunisia_Chen|Berber_Tunisia_Sen|Besermyan|Bhumihar_Bihar|Bhumij|Biaka|Birhor|Bitonga|Blang|Bolivian_Cochabamba|Bolivian_LaPaz|Bolivian_Pando|Bonan|Bonda|Bosnian|Brahmin_Gujarat|Brahmin_Gujarat_Audichya|Brahmin_Gujarat_Bardai|Brahmin_Gujarat_Nagar|Brahmin_Gujarat_o|Brahmin_Himachal_Pradesh_West|Brahmin_Jammu_Dogra|Brahmin_Karnataka_Tulu|Brahmin_Kerala_Nambudiri|Brahmin_Konkani_Catholic|Brahmin_Kumaon|Brahmin_Manipuri|Brahmin_Rajasthan|Brahmin_Tamil_Nadu|Brahmin_Tamil_Nadu_Iyengar|Brahmin_Tamil_Nadu_Iyer|Brahmin_Telugu|Brahmin_Telugu_Niyogi|Brahmin_Telugu_Vaidiki|Brahmin_UP_Awadh_Saryupareen|Brahmin_UP_Kanyakubja|Brahmin_UP_Lucknow|Brahmin_UP_Lucknow_o|Brahmin_Uttar_Pradesh_East|Brahmin_Uttar_Pradesh_East_o|Brahmin_West_Bengal|Brahui|Bukharian_Jew|Bu
lala|Bulgarian|Bunt_Kerala|Burmese|Burusho|Buryat|Cachi|Cambodian|Cameroon_Aghem|Cameroon_Bafut|Cameroon_Bakoko|Cameroon_Bangwa|Cameroon_Mbo|Chamalin|Chamar_Uttar_Pradesh|Chamar_Uttar_Pradesh_o|Changana|Changshan_Yao_Guizhou|Chechen|Chenchu|Cherkes|Chipewyan|Chopi|Chukchi|Chuvash|Circassian|Cochin_Jew_A|Cochin_Jew_B|Colla|Cossack_Kuban|Cossack_Ukrainian|Cree|Croatian|Cypriot|Czech|Dai|Damai|Danish|Darginian|Datog|Daur|Dharkar|Dinka|Dolgan|Dong_Guizhou|Dong_Hunan|Dongxiang|Druze|Dungan|Dusadh|Dusun|Dutch|Egyptian|Elmolo|EmiratiA|EmiratiB|EmiratiC|English|English_Cornwall|Eritrean|Erzya|Esan_Nigeria|Eskimo|Eskimo_Chaplin|Eskimo_Naukan|Eskimo_Sireniki|Estonian|Ethiopian_Afar|Ethiopian_Agaw|Ethiopian_Amhara|Ethiopian_Anuak|Ethiopian_Ari|Ethiopian_Ari_blacksmith|Ethiopian_Ari_cultivator|Ethiopian_Gumuz|Ethiopian_Jew|Ethiopian_Mursi|Ethiopian_Oromo|Ethiopian_Tigray|Ethiopian_Wolayta|Even|Evenk|Ezhava|Ezid|Finnish_Central|Finnish_East|Finnish_North|Finnish_Southeast|Finnish_Southwest|French_Alsace|French_Auvergne|French_Bearn|French_Bigorre|French_Brittany|French_Chalosse|French_Corsica|French_Nord|French_Occitanie|French_Paris|French_Provence|French_South|Fulani|Fulani_Ziniare|Gadaba|Gagauz|Gambian|Ganguela|Gelao|Georgian_Ajar|Georgian_Guria|Georgian_Imer|Georgian_Javakheti|Georgian_Jew|Georgian_Kakh|Georgian_Kart|Georgian_Khevs|Georgian_Laz|Georgian_Lechkhumi|Georgian_Megr|Georgian_Meskheti|Georgian_Mtiuleti|Georgian_NorthEast|Georgian_Ratcha|Georgian_Samtckhe|Georgian_Svaneti|Georgian_Tush|Georgian_West|German|German_East|German_Erlangen|German_Hamburg|Gond|Greek_Achaea|Greek_Apulia|Greek_Arcadia|Greek_Argolis|Greek_Cappadocia|Greek_Central_Anatolia|Greek_Central_Macedonia|Greek_Corinthia|Greek_Crete|Greek_Crete_Chania|Greek_Crete_Heraklion|Greek_Crete_Lasithi|Greek_Cyclades_Amorgos|Greek_Cyclades_Kea|Greek_Cyclades_Milos|Greek_Cyclades_Tinos|Greek_Deep_Mani|Greek_Dodecanese|Greek_Dodecanese_Rhodes|Greek_East_Macedonia_and_Thrace|Greek_East_Taygetos|Greek_Elis|Greek_Izm
ir|Greek_Kos|Greek_Laconia|Greek_Macedonia|Greek_Messenia|Greek_North_Tsakonia|Greek_Peloponnese|Greek_South_Tsakonia|Greek_Thessaly|Greek_Trabzon|Greek_West_Taygetos|Greenlander_East|Greenlander_West|Gujar_Punjab|Gujar_Rajasthan|Gujar_Swat|Gujar_Swat_o|Gujarati|Gujarati_Bharuch_Muslim|Gurung|Hadza|Hakkipikki|Han_Chongqing|Han_Fujian|Han_Guangdong|Han_Guizhou|Han_Henan|Han_Hubei|Han_Jiangsu|Han_Shandong|Han_Shanghai|Han_Shanxi|Han_Sichuan|Han_Zhejiang|Hani|Hawaiian|Hazara|Hezhen|Hinukh|Hmong|Ho|Htin_Mal|Hui|Hui_Guizhou|Huichol|Hungarian|Hunzib|Icelandic|Igbo|Igorot|Indonesian_Bali|Indonesian_Java|Ingrian|Ingushian|Iranian|Iranian_Bandari|Iranian_Fars|Iranian_Jew|Iranian_Lor|Iranian_Mazandarani|Iranian_Persian_Shiraz|Iranian_Zoroastrian|Iraqi|Iraqi_Jew|Iraqw|Irish|Irula|Italian_Abruzzo|Italian_Aosta_Valley|Italian_Apulia|Italian_Basilicata|Italian_Bergamo|Italian_Calabria|Italian_Campania|Italian_Emilia|Italian_Friuli_Venezia_Giulia_Sappada|Italian_Jew|Italian_Lazio|Italian_Liguria|Italian_Lombardy|Italian_Marche|Italian_Molise|Italian_Northeast|Italian_Piedmont|Italian_Trentino_Alto_Adige|Italian_Tuscany|Italian_Umbria|Italian_Veneto|Itelmen|Jamatia|Japanese|Jarawa|Jat_Haryana|Jat_Pahari|Jat_Punjab_Muslim|Jat_Punjab_Sikh|Jat_Uttar_Pradesh|Jehai|Jordanian|Ju_hoan_North|Juang|Kaba|Kabardin|Kadar|Kaitag|Kalash|Kalmyk|Kamboj|Kamboj_o|Kamma|Kanjar|Karachay|Karaite_Egypt|Karakalpak|Karata|Karelian|Karen_Sgaw|Karitiana|Kashmiri_India_Muslim|Kashmiri_Pakistan|Kashmiri_Pakistan_o|Kashmiri_Pandit|Kazakh|Kazakh_China|Kazakh_Xinjiang|Ket|Khakass|Khakass_Kachins|Khamnegan|Khanty|Khatri|Khatri_o|Khmer|Kho_Singanali|Khomani_San|Khonda_Dora|Kikuyu|Kinh_Vietnam|Kirghiz|Kirghiz_China|Knanaya|Kohistani|Koinanbe|Kol|Koli_Gujarat|Komi|Kongo|Konkani_Christian_A|Konkani_Christian_B|Korean|Korean_Antu|Korwa|Koryak|Kosipe|Kshatriya_Uttar_Pradesh_East|Kshatriya_Uttar_Pradesh_East_o|Kubachinian|Kumyk|Kurdish|Kurdish_Jew|Kurichiya|Kurumba|Kusunda|Kuy_Suay|Lahu|Lak|Laka|Lao|Latvian|Lawa|Lebanes
e_Christian|Lebanese_Druze|Lebanese_Muslim|Lebbo|Lemande|Lezgin|Li|Libyan|Libyan_Jew|Lithuanian_PA|Lithuanian_PZ|Lithuanian_RA|Lithuanian_SZ|Lithuanian_VA|Lithuanian_VZ|Luhya_Kenya|Luo|Luzon|Macedonian|Mada|Madagascar_Mikea|Madagascar_Temoro|Madagascar_Vezo|Madiga|Magar|Makhuwa|Makrani|Mala|Malay|Malayan|Maltese|Manchu_Bijie|Manchu_Jinsha|Manchu_Jinzhou|Manchu_Liaoning|Mandenka|Maniq|Maniyani_Kerala|Mansi|Manyika|Maonan|Maori|Maratha|Mari|Masai|Mayan|Mbuti|Mende_Sierra_Leone|Miao|Miao_Leishan|Miao_Songtao|Mixe|Mixtec|Mlabri|Mogush|Moksha|Moldovan|Moldovan_o|Mon|Mongol_Bijie|Mongol_IMAR|Mongol_Inner_Mongolia|Mongol_Xinjiang|Mongola|Mongolian|Montenegrin|Moroccan|Moroccan_Jew|Moroccan_North|Moroccan_South|Mountain_Jew|Mountain_Jew_o|Mozabite|Mulam|Murut|Mwani|Nahua|Nair|Nanai|Nasoi|Nasrani|Naxi|Ndau|Negidal|Nenets|Nepali_Sherpa_Rolwaling|Nepali_Tamang_Simigaon|Nepali_Tamang_Tashinam|Newar|Nganassan|Ngumba|Nivkh|Nogai|North_Kannadi|North_Ossetian|Norwegian|Nyah_Kur|Nyaneka|Nyanja|Ogiek|Onge|Orcadian|Oroqen|Ossetian|Palestinian|Palestinian_Beit_Sahour|Pallan|Pamiri_Badakhshan|Pamiri_Ishkashim|Pamiri_Rushan|Pamiri_Sarikoli|Pamiri_Sarikoli_China|Pamiri_Shugnan|Pamiri_Wakhi|Paniya|Papuan|Parsi_India|Parsi_Pakistan|Pashtun_Kandahar|Pashtun_Kurram|Pashtun_North_Afghanistan|Pashtun_Tarkalani|Pashtun_Uthmankhel|Pashtun_Yusufzai|Pathan_Bhopal|Piapoco|Pima|Piramalai_Kallar|Poduval_Kerala|Polish|Polish_Kashubian|Polish_Silesian|Portuguese|Pulaya_Kerala|Pulliyar|Pumi|Punjabi_Christian_India|Punjabi_Hindu_India|Punjabi_Lahore|Punjabi_Muslim_India|Punjabi_Sikh_India|Qiang_Danba|Qiang_Daofu|QingYao_Guizhou|Quechua|Rai|Rajput_Garhwal|Rajput_Jammu_Pahari_Pakistan|Rajput_Madhya_Pradesh|Rajput_Potohar|Rajput_Rajasthan|Rajput_Rajasthan_o1|Rajput_Rajasthan_o2|Ratlub|Reddy|Relli|Rendille|Riang|Rohingya|Roma_Balkans|Roma_Barcelona|Roma_Bilbao|Roma_Granada|Roma_Madrid|Roma_Porto|Romanian|Romaniote_Jew|Ronga|Ror|Rumelia_East|Russian_Belgorod|Russian_Kaluga|Russian_Kostroma|Russian_Krasnoborsky
|Russian_Kursk|Russian_Leshukonsky|Russian_Orel|Russian_Pinega|Russian_Pinezhsky|Russian_Pskov|Russian_Ryazan|Russian_Smolensk|Russian_Tver|Russian_Voronez|Russian_Yaroslavl|Saami|Saami_Kola|Saharawi|Sakha|Sakilli|Salar|Saliya_Kerala|Samaritan|Sandawe|Santhal|Sardinian|Satnami_Chhattisgarh|Saudi|SaudiA|SaudiB|Scottish|Selkup|Sena|Sengwer|Sephardic_Jew|Sephardic_Jew_o|Serbian|She|Sherpa|Shetlandic|Shor|Shor_Khakassia|Shor_Mountain|Sicilian_East|Sicilian_West|Sindhi|Sindhi_o|Slovakian|Slovenian|Somali|Somali_Kenya|Sorb_Niederlausitz|Spanish_Alacant|Spanish_Andalucia|Spanish_Aragon|Spanish_Aragon_North|Spanish_Asturias|Spanish_Baleares|Spanish_Barcelones|Spanish_Biscay|Spanish_Burgos|Spanish_Camp_de_Tarragona|Spanish_Canarias|Spanish_Cantabria|Spanish_Castello|Spanish_Castilla_La_Mancha|Spanish_Castilla_Y_Leon|Spanish_Cataluna|Spanish_Catalunya_Central|Spanish_Eivissa|Spanish_Extremadura|Spanish_Galicia|Spanish_Girona|Spanish_La_Rioja|Spanish_Lleida|Spanish_Mallorca|Spanish_Menorca|Spanish_Murcia|Spanish_Navarra|Spanish_Pais_Vasco|Spanish_Penedes|Spanish_Pirineu|Spanish_Soria|Spanish_Valencia|Sri_Lankan|Sudanese|Surui|Swedish|Swiss_French|Swiss_German|Swiss_Italian|Syed_Uttar_Pradesh_West|Syrian|Syrian_Jew|Tabasaran|Tai_Lue|Tajik_Ayni|Tajik_Hisor|Tajik_Kulob|Tajik_Yaghnobi|Talysh_Azerbaijan|Tamang|Tamil_Sri_Lanka|Tarkhan_Muslim|Tat_Azerbaijan|Tat_Dagestan_Dzhalgan|Tat_Dagestan_Nyugdi|Tatar_Crimean_steppe|Tatar_Kazan|Tatar_Lipka|Tatar_Mishar|Tatar_Siberian|Tatar_Siberian_Zabolotniye|Telugu|Thai|Tharu|Tharu_o1|Tharu_o2|Thiyya|Tibetan_Chamdo|Tibetan_Gangcha|Tibetan_Gannan|Tibetan_Lhasa|Tibetan_Nagqu|Tibetan_Shannan|Tibetan_Shigatse|Tibetan_Xinlong|Tibetan_Xunhua|Tibetan_Yajiang|Tibetan_Yunnan|Tikar_South|Tindal|Tlingit|Todzin|Tripuri|Tsez_A|Tsez_B|Tswa|Tu|Tubalar|Tujia|Tunisian|Tunisian_Berber_Matmata|Tunisian_Berber_Tamezret|Tunisian_Berber_Zraoua|Tunisian_Douz|Tunisian_Jew|Tunisian_Rbaya|Turkish_Antalya|Turkish_Aydin|Turkish_Balikesir|Turkish_Deliorman|Turkish_Denizli|T
urkish_Erzurum|Turkish_Giresun|Turkish_Kayseri|Turkish_Konya|Turkish_Nevsehir|Turkish_Rumeli|Turkish_Sivas|Turkish_Trabzon|Turkmen|Turkmen_Uzbekistan|Tuvinian|Tyagi|Udi|Udmurt|Ukrainian_Chernihiv|Ukrainian_Dnipro|Ukrainian_Lviv|Ukrainian_Rivne|Ukrainian_Sumy|Ukrainian_Zakarpattia|Ukrainian_Zhytomyr|Ukrainian_Zhytomyr_o|Ulchi|Umbundu|Uttar_Pradesh_Scheduled_Castes|Uygur|Uzbek|Vaniya_Kerala|Velama|Vellalar|Vepsian|Vishwakarma_Kerala|Vizayan|Wa|Welsh|Wichi|Xibo|Yadav_Telugu|Yakut|Yao|Yemenite_Al_Bayda|Yemenite_Al_Jawf|Yemenite_Amran|Yemenite_Dhamar|Yemenite_Jew|Yemenite_Mahra|Yi|Yoruba|Yugur|Yukagir_Forest|Yukagir_Tundra|Yukpa|Yuku|Zapotec|Zhuang'
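The collapsed string above contains a literal `NA` between `Bantu_Kenya` and `Baoan`. A small sketch of why that happens and one way to avoid it, using a short stand-in vector rather than the real labels; `Abazin_x` is a hypothetical label used only for the check:

```r
library(stringr)

# Short stand-in vector; in the notebook this would be the extracted labels.
labels <- c("Abazin", NA, "Adygei")

# paste0() coerces NA to the string "NA", which then acts as a real alternative.
with_na <- paste0(labels, collapse = "|")

# Dropping missing values first keeps the pattern clean.
without_na <- paste0(labels[!is.na(labels)], collapse = "|")

with_na
without_na

# The collapsed string can be wrapped into a pattern anchored at both ends.
pattern <- paste0("^(", without_na, ")$")
hits <- str_detect(c("Abazin", "Abazin_x", "NA"), pattern)
hits
```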
In [16]:
avgs = avgs %>% dplyr::filter(!str_detect(V1,"^(ARG|AUS|BHS|BLZ|BRA|BWA|CAN|CHL|CHN|CMR|COG|CUW|DOM|ETH|GUM|HTI|IDN|IND|JPN|KAZ|KEN|KGZ|KOR|LAO|MEX|MNG|MWI|MYS)"))
avgs = avgs %>% dplyr::filter(!str_detect(V1,"^(NPL|PAK|PAN|PER|PYF|Saka|Sarg|SDN|THA|TJK|TKM|TON|TWN|TZA|UGA|USA|UZB|VEN|VNM|VUT|ZAF)"))
avgs = avgs %>% dplyr::filter(!str_detect(V1,"_o$"))
avgs = avgs %>% dplyr::filter(!str_detect(V1,"low"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Abazin|Abkhasian|Adygei|Aeta|Afrikaner|Agta|Ahiska|Akha|Akhvakh|Alawite|Alevi|Algerian|Altaian|Amerindian|Ami|Andian|Arain)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Armenian_[AGHP]|Arora|Assyrian|Asur|Atayal|Australian|Awan|Aymara|Azerbaijani_R|Basque_[ABGLR]|Batak|Bedzan|Bengali|Berber_[AM])"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Bagvalin|Bahun|Baiku|Balti|Baniya|Bantu)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Besermyan|Bhumi|Biaka|Birhor|Bitonga|Blang|Bolivian|Bonan|Bonda|Brahmin|Brahui|Bulala|Bunt|Burmese|Burusho|Buryat|Cachi|Cambodian)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Cameroon|Chama|Chan|Chen|Chipewyan|Chopi|Chukchi|Chuvash|Circassian|Cochin|Colla|Cree|Dai|Damai|Darginian|Datog|Daur|Dharkar|Dinka)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Dolgan|Dong|Dungan|Dusadh|Dusun|Elmolo|Eritrean|Esan_Nigeria|Eskimo|Ethiopian|Even|Evenk|Ezhava|Ezid)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(French_Au|French_B[ei]|French_[COP]|Fulani|Gadaba|Gagauz|Gambian|Ganguela|Gelao|Georgian_[GIJKLMRST]|Gond|Greek_[ADEIKLMNSTW])"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Gujar|Gurung|Hadza|Hakkipikki|Han|Hawaiian|Hazara|Hezhen|Hinukh|Hmong|Ho|Htin_Mal|Hui|Hunzib|Igbo|Igorot|Indonesian|Iranian_[BJLMP])"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Iraq|Irula|Italian_[ABEFJLM]|Itelmen|Jamatia|Japanese|Jarawa|Jat|Jehai|Ju_hoan_North|Juang|Kaba|Kadar|Kaitag|Kalash|Kalmyk|Kamboj)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Kamma|Kanjar|Karachay|Karaite_Egypt|Karakalpak|Karata|Karen_Sgaw|Karitiana|Kashmiri|Kazakh|Ket|Khakass|Khamnegan|Khanty|Khatri)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Khmer|Kho|Kikuyu|Kinh_Vietnam|Kirghiz|Knanaya|Kohistani|Koinanbe|Kol|Komi|Kongo|Konkani|Korean|Korwa|Koryak|Kosipe)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Kshatriya|Kubachinian|Kumyk|Kurdish|Kurichiya|Kurumba|Kusunda|Kuy_Suay|Lahu|Laka|Lao|Lawa|Lebbo|Lemande|Lezgin|Li|Libyan)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Lithuanian_[RP]|Luhya_Kenya|Luo|Luzon|Macedonian|Mada|Madagascar|Madiga|Magar|Makhuwa|Makrani|Mala|Malay|Manchu|Mandenka|Maniq)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Maniyani_Kerala|Mansi|Manyika|Maonan|Maori|Maratha|Mari|Masai|Mayan|Mbuti|Mende_Sierra_Leone|Miao|Mix|Mlabri|Mogush|Mon|Moroccan_)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Mountain_|Mozabite|Mulam|Murut|Mwani|Nahua|Nair|Nanai|Nasoi|Nasrani|Naxi|Ndau|Negidal|Nenets|Nepali|Newar|Nganassan|Ngumba|Nivkh|Nogai)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(North|Nyah_Kur|Nyan|Ogiek|Onge|Orcadian|Oroqen|Palestinian_|Pallan|Pamiri|Paniya|Papuan|Parsi|Pashtun|Pathan_Bhopal|Piapoco|Pima)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Piramalai_Kallar|Poduval_Kerala|Polish_|Pul|Pumi|Punj|Qiang|Qing|Quechua|Rai|Raj|Ratlub|Reddy|Relli|Rendille|Riang|Rohingya)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Roma_|Romaniote|Ronga|Ror|Rumelia_East|Russian_[BKRSTVY]|Saharawi|Sakha|Sakilli|Salar|Saliya_Kerala|Samaritan|Sandawe|Santhal)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Satnami_Chhattisgarh|Saudi[AB]|Selkup|Sena|Sengwer|She|Shor|Sindhi|Somali|Spanish_[ABCEGLPS]|Sri_Lankan|Sudanese|Surui)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Syed|Tabasaran|Tai_Lue|Tajik|Talysh_Azerbaijan|Tamang|Tamil_Sri_Lanka|Tarkhan_Muslim|Tat|Telugu|Thai|Tharu|Thiyya|Tibetan)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Tikar_South|Tindal|Tlingit|Todzin|Tripuri|Tsez|Tswa|Turkish_[ABDEGKNS]|Turkmen|Tuvinian|Tyagi|Udi|Udmurt|Ulchi|Umbundu)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Uttar|Uygur|Uzbek|Vaniya_Kerala|Velama|Vellalar|Vishwakarma_Kerala|Vizayan|Wa|Welsh|Wichi|Xibo|Yadav_Telugu|Yakut|Yao|Yemenite)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"^(Yi|Yoruba|Yugur|Yukagir|Yukpa|Yuku|Zapotec|Zhuang)"))
modavgs = modavgs %>% dplyr::filter(!str_detect(V1,"_o$"))
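The filters above match prefixes, so a short alternative such as `Li` or `She` also removes every longer label that starts with it. Whether that is intended is not stated here. The sketch below shows the difference between a prefix match and a whole-token match on a few labels from the listing; `prefix_pattern` is a hypothetical helper and not part of the notebook:

```r
library(stringr)

# Labels copied from the listing above.
labels <- c("Li", "Libyan", "Lithuanian_PA", "Latvian", "She", "Sherpa", "Shetlandic")

# Hypothetical helper: collapse a vector of prefixes into one anchored pattern.
# With whole_token = TRUE the prefix must end at "_" or at the end of the label.
prefix_pattern <- function(prefixes, whole_token = FALSE) {
  suffix <- if (whole_token) "(_|$)" else ""
  paste0("^(", paste0(prefixes, collapse = "|"), ")", suffix)
}

loose  <- labels[str_detect(labels, prefix_pattern(c("Li", "She")))]
strict <- labels[str_detect(labels, prefix_pattern(c("Li", "She"), whole_token = TRUE))]

loose   # every label that starts with "Li" or "She"
strict  # only labels whose first token is exactly "Li" or "She"
```

Keeping the prefixes in a character vector and building the pattern once would also let the long run of `filter()` calls be expressed as a single call, at the cost of one long vector.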
In [17]:
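# List the labels that remain after filtering.
# unique() has no `sort` argument (it was silently ignored), so the order follows the file.
avgs$V1 %>% unique()
modavgs$V1 %>% unique()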
  1. 'ARM_Areni_C'
  2. 'ARM_LBA'
  3. 'ARM_Lchashen_MBA'
  4. 'ARM_MBA'
  5. 'AUT_IA_La_Tene'
  6. 'AUT_LBK_N'
  7. 'Baltic_EST_BA'
  8. 'Baltic_EST_IA'
  9. 'Baltic_EST_MA'
  10. 'Baltic_EST_Narva'
  11. 'Baltic_LTU_BA'
  12. 'Baltic_LTU_Meso'
  13. 'Baltic_LTU_Narva'
  14. 'Baltic_LVA_BA'
  15. 'Baltic_LVA_HG'
  16. 'Baltic_LVA_MN'
  17. 'BEL_GoyetQ116-1'
  18. 'BEL_GoyetQ2'
  19. 'Bell_Beaker_Bavaria'
  20. 'Bell_Beaker_CHE'
  21. 'Bell_Beaker_CZE'
  22. 'Bell_Beaker_CZE_early'
  23. 'Bell_Beaker_CZE_late'
  24. 'Bell_Beaker_England'
  25. 'Bell_Beaker_England_EBA'
  26. 'Bell_Beaker_England_highEEF'
  27. 'Bell_Beaker_England_highWHG'
  28. 'Bell_Beaker_FRA'
  29. 'Bell_Beaker_FRA_C'
  30. 'Bell_Beaker_HUN'
  31. 'Bell_Beaker_HUN_EBA'
  32. 'Bell_Beaker_Iberia'
  33. 'Bell_Beaker_Iberia_C'
  34. 'Bell_Beaker_ITA'
  35. 'Bell_Beaker_Mittelelbe-Saale'
  36. 'Bell_Beaker_Mittelelbe-Saale_contam'
  37. 'Bell_Beaker_NLD'
  38. 'Bell_Beaker_POL'
  39. 'Bell_Beaker_Rhine-Main'
  40. 'Bell_Beaker_Scotland'
  41. 'BGR_Bacho_Kiro_MUP'
  42. 'BGR_Beli_Breyag_EBA'
  43. 'BGR_C'
  44. 'BGR_Dzhulyunitsa_N'
  45. 'BGR_EBA'
  46. 'BGR_EBA_contam'
  47. 'BGR_IA'
  48. 'BGR_Krepost_N'
  49. 'BGR_Late_C'
  50. 'BGR_Middle_C'
  51. 'BGR_MLBA'
  52. 'BGR_MP_N'
  53. 'BGR_N'
  54. 'BGR_Varna_C'
  55. 'Canary_Islands_Guanche'
  56. 'Channel_Islands_EIA'
  57. 'Channel_Islands_IA'
  58. 'Channel_Islands_LN'
  59. 'Channel_Islands_MN'
  60. 'CHE_EBA'
  61. 'CHE_FN'
  62. 'CHE_FN_steppe_contam'
  63. 'CHE_IA'
  64. 'CHE_LN'
  65. 'CHE_LN_contam'
  66. 'CHE_LN_steppe'
  67. 'CHE_MN'
  68. 'Corded_Ware_Baltic'
  69. 'Corded_Ware_Baltic_early'
  70. 'Corded_Ware_CHE'
  71. 'Corded_Ware_CZE'
  72. 'Corded_Ware_CZE_early'
  73. 'Corded_Ware_CZE_late'
  74. 'Corded_Ware_CZE_noSteppe'
  75. 'Corded_Ware_DEU'
  76. 'Corded_Ware_POL'
  77. 'Corded_Ware_POL_early'
  78. 'Corded_Ware_Proto-Unetice_POL'
  79. 'CZE_Bilina_BA'
  80. 'CZE_C'
  81. 'CZE_C_oSteppe'
  82. 'CZE_Early_Slav'
  83. 'CZE_EBA_Unetice'
  84. 'CZE_EE'
  85. 'CZE_EN_LBK'
  86. 'CZE_Hallstatt_Bylany'
  87. 'CZE_IA_Hallstatt'
  88. 'CZE_IA_La_Tene'
  89. 'CZE_IA_La_Tene_Hallstatt'
  90. 'CZE_IA_La_Tene_oFennoscandian'
  91. 'CZE_Krems_UP'
  92. 'CZE_LBA_Knoviz'
  93. 'CZE_LBA_Knoviz_o2'
  94. 'CZE_LBA_Knoviz_o3'
  95. 'CZE_LN'
  96. 'CZE_MBA_Tumulus'
  97. 'CZE_ME_Baden'
  98. 'CZE_ME_GAC'
  99. 'CZE_ME_Rivnac'
  100. 'CZE_MN'
  101. 'CZE_N'
  102. 'CZE_N_oWHG'
  103. 'CZE_N_possible'
  104. 'CZE_PE'
  105. 'CZE_Unetice_C'
  106. 'CZE_Unetice_EBA'
  107. 'CZE_Unetice_preC'
  108. 'CZE_Vestonice16'
  109. 'DEU_Alberstedt_LN'
  110. 'DEU_Anselfingen_FN'
  111. 'DEU_Baalberge_MN'
  112. 'DEU_BenzigerodeHeimburg_LN'
  113. 'DEU_Blatterhohle_MN'
  114. 'DEU_Esperstedt_MN'
  115. 'DEU_Halberstadt_LBA'
  116. 'DEU_Karsdorf_LN'
  117. 'DEU_LBK_HBS'
  118. 'DEU_LBK_KD'
  119. 'DEU_LBK_SCH'
  120. 'DEU_LBK_SMH'
  121. 'DEU_LBK_UW'
  122. 'DEU_Lech_BBC'
  123. 'DEU_Lech_EBA'
  124. 'DEU_Lech_EBA_contam'
  125. 'DEU_Lech_MBA'
  126. 'DEU_MA_ACD_Baiuvaric'
  127. 'DEU_MA_ACD_Nordic'
  128. 'DEU_MA_ACD_Ostrogothic'
  129. 'DEU_MA_Alemannic'
  130. 'DEU_MA_Alemannic_o1'
  131. 'DEU_MA_Alemannic_o2'
  132. 'DEU_MA_Baiuvaric'
  133. 'DEU_MA_Erfurt1'
  134. 'DEU_MA_Erfurt2'
  135. 'DEU_MA_Erfurt3'
  136. 'DEU_MA_Krakauer_Berg'
  137. 'DEU_Meso_BDB'
  138. 'DEU_Meso_TGM'
  139. 'DEU_Roman'
  140. 'DEU_Singen_EBA'
  141. 'DEU_Singen_EIA'
  142. 'DEU_Tollense_BA'
  143. 'DEU_Tollense_BA_o2'
  144. 'DEU_Unetice_EBA'
  145. 'DEU_Wartberg_MN'
  146. 'DNK_BA'
  147. 'DNK_Djursland_SGC'
  148. 'DNK_LN'
  149. 'DNK_MN_B'
  150. 'EGY_Hellenistic_contam'
  151. 'EGY_Late_Period'
  152. 'England_C_EBA'
  153. 'England_C_EBA_highEEF'
  154. 'England_CA_EBA'
  155. 'England_EastYorkshire_EIA'
  156. 'England_EastYorkshire_IA'
  157. 'England_EastYorkshire_LIA'
  158. 'England_EastYorkshire_MIA'
  159. 'England_EastYorkshire_MIA_LIA'
  160. 'England_EBA'
  161. 'England_EBA_Bell_Beaker'
  162. 'England_EBA_highEEF'
  163. 'England_EIA'
  164. 'England_EIA_highEEF'
  165. 'England_EMBA'
  166. 'England_IA'
  167. 'England_IA_EarlyMedieval'
  168. 'England_LBA'
  169. 'England_LBA_highEEF'
  170. 'England_LIA'
  171. 'England_LIA_highEEF'
  172. 'England_MBA'
  173. 'England_MBA_highEEF'
  174. 'England_Mesolithic'
  175. 'England_MIA'
  176. 'England_MIA_highEEF'
  177. 'England_MIA_LIA'
  178. 'England_N'
  179. 'England_Roman'
  180. 'England_Saxon'
  181. 'England_Trumpington_N'
  182. 'FIN_Levanluhta_IA'
  183. 'FRA_Alsace_EBA'
  184. 'FRA_Alsace_EN'
  185. 'FRA_Alsace_IA1'
  186. 'FRA_Alsace_IA2'
  187. 'FRA_Alsace_LBA'
  188. 'FRA_Alsace_MN'
  189. 'FRA_Champagne_EBA'
  190. 'FRA_Champagne_IA2'
  191. 'FRA_Champagne_MN'
  192. 'FRA_EBA'
  193. 'FRA_EN_PEN'
  194. 'FRA_ENMN_LBR'
  195. 'FRA_FN_Lingolsheim_steppe'
  196. 'FRA_Hauts_De_France_IA2'
  197. 'FRA_Hauts_De_France_LN'
  198. 'FRA_Hauts_De_France_MN'
  199. 'FRA_La_Clape_LN_EBA_Veraza'
  200. 'FRA_La_Clape_LN_EBA_Veraza_oSteppe'
  201. ⋯
  202. 'RUS_Shamanka_EBA'
  203. 'RUS_Shamanka_N'
  204. 'RUS_Sidelkino_HG'
  205. 'RUS_Sintashta_MLBA'
  206. 'RUS_Sintashta_MLBA_contam'
  207. 'RUS_Sintashta_MLBA_o1'
  208. 'RUS_Sintashta_MLBA_o2'
  209. 'RUS_Sintashta_MLBA_o3'
  210. 'RUS_Sosonivoy_HG'
  211. 'RUS_Srubnaya_Alakul_MLBA'
  212. 'RUS_Srubnaya_MLBA'
  213. 'RUS_Steppe_Maykop'
  214. 'RUS_Sunghir'
  215. 'RUS_Sunghir_MA'
  216. 'RUS_Tagar'
  217. 'RUS_Trans-Baikal_BA'
  218. 'RUS_Trans-Baikal_N'
  219. 'RUS_Tuva_Aldy_Bel_IA'
  220. 'RUS_Tyumen_HG'
  221. 'RUS_Ust_Belaya'
  222. 'RUS_Ust_Belaya_Angara'
  223. 'RUS_Ust_Ida_EBA'
  224. 'RUS_Ust_Ida_LN'
  225. 'RUS_Ust_Ishim'
  226. 'RUS_Ust_Kyakhta'
  227. 'RUS_Veretye_Meso'
  228. 'RUS_Volga-Kama_N'
  229. 'RUS_Vologda_Veretye_Meso'
  230. 'RUS_Vonyuchka_En'
  231. 'RUS_Yakutia_LUP'
  232. 'RUS_Yakutia_Ymyiakhtakh_LN'
  233. 'RUS_Yana_MA'
  234. 'RUS_Yana_UP'
  235. 'RUS_Yankovsky_IA'
  236. 'RUS_Zevakino_Chilikta_IA'
  237. 'Sarmatian_KAZ'
  238. 'Sarmatian_KAZ_Aigyrly'
  239. 'Sarmatian_KAZ_Aktobe'
  240. 'Sarmatian_KAZ_Bisoba'
  241. 'Sarmatian_MDA'
  242. 'Sarmatian_RUS_Caspian_steppe'
  243. 'Sarmatian_RUS_Caucasus'
  244. 'Sarmatian_RUS_Pokrovka'
  245. 'Sarmatian_RUS_Urals'
  246. 'Sarmatian_Segizsay'
  247. 'Scotland_C_EBA_highEEF_lc'
  248. 'Scotland_C_EBA_mediumhighEEF'
  249. 'Scotland_CA_EBA'
  250. 'Scotland_EIA'
  251. 'Scotland_IA'
  252. 'Scotland_LBA'
  253. 'Scotland_LIA'
  254. 'Scotland_MBA'
  255. 'Scotland_Megalithic'
  256. 'Scotland_MIA'
  257. 'Scotland_MIA_LIA'
  258. 'Scotland_N'
  259. 'Scotland_Pictish_EMA'
  260. 'Scotland_Skye_IA'
  261. 'Scotland_Skye_N'
  262. 'Scythian_HUN'
  263. 'Scythian_MDA'
  264. 'Scythian_RUS_Urals'
  265. 'Scythian_UKR'
  266. 'SRB_BA_Maros'
  267. 'SRB_Iron_Gates_HG'
  268. 'SRB_Mokrin_EBA_Maros'
  269. 'SRB_Mokrin_EBA_Maros_oAegean'
  270. 'SRB_N'
  271. 'SRB_Starcevo_N'
  272. 'SVK_EBA'
  273. 'SVK_IA_Vekerzug'
  274. 'SVK_LIA'
  275. 'SVK_LIA_La_Tene'
  276. 'SVK_Poprad_MA'
  277. 'SVN_EIA'
  278. 'SVN_LBA'
  279. 'SVN_LBA_EIA'
  280. 'SVN_MBA'
  281. 'SVN_MIA_oSouth'
  282. 'SWE_Ajvide_PWC_BAC'
  283. 'SWE_BA'
  284. 'SWE_Battle_Axe'
  285. 'SWE_Hemmor_PWC_BAC'
  286. 'SWE_IA'
  287. 'SWE_LN'
  288. 'SWE_Megalithic_Ansarve'
  289. 'SWE_Meso'
  290. 'SWE_Motala_HG'
  291. 'SWE_Ollsjo_BA'
  292. 'SWE_PWC_NHG'
  293. 'SWE_TRB'
  294. 'SWE_Vasterbjers_PWC_BAC'
  295. 'SWE_Viking_Age_Sigtuna'
  296. 'SYR_Ebla_EMBA'
  297. 'SYR_Tell_Qarassa_Early_Antiquity'
  298. 'TUR_Alalakh_MLBA'
  299. 'TUR_Arslantepe_EBA'
  300. 'TUR_Arslantepe_LC'
  301. 'TUR_Barcin_C'
  302. 'TUR_Barcin_N'
  303. 'TUR_Boncuklu_N'
  304. 'TUR_Buyukkaya_EC'
  305. 'TUR_Camlibel_Tarlasi_LC'
  306. 'TUR_Catalhoyuk_N_Ceramic'
  307. 'TUR_IA'
  308. 'TUR_Ikiztepe_LC'
  309. 'TUR_Isparta_EBA'
  310. 'TUR_Kaman-Kalehoyuk_MLBA'
  311. 'TUR_Kumtepe_N'
  312. 'TUR_Ottoman'
  313. 'TUR_Ovaoren_EBA'
  314. 'TUR_Pinarbasi_HG'
  315. 'TUR_Tell_Kurdu_EC'
  316. 'TUR_Tell_Kurdu_MC'
  317. 'TUR_Tepecik_Ciftlik_N'
  318. 'TUR_Titris_Hoyuk_EBA'
  319. 'UKR_Catacomb'
  320. 'UKR_Chernyakhiv_Legedzine'
  321. 'UKR_Chernyakhiv_Shyshaky'
  322. 'UKR_Cimmerian'
  323. 'UKR_Dereivka_I_En1'
  324. 'UKR_Dereivka_I_En2'
  325. 'UKR_EBA'
  326. 'UKR_Globular_Amphora'
  327. 'UKR_MBA'
  328. 'UKR_Meso'
  329. 'UKR_N'
  330. 'UKR_Srubnaya_MLBA'
  331. 'UKR_Trypillia'
  332. 'UKR_Trypillia_En'
  333. 'VK2020_DNK_Funen_VA'
  334. 'VK2020_DNK_Jutland_IA'
  335. 'VK2020_DNK_Jutland_VA'
  336. 'VK2020_DNK_Langeland_VA'
  337. 'VK2020_DNK_Sealand_EVA'
  338. 'VK2020_DNK_Sealand_IA'
  339. 'VK2020_DNK_Sealand_LNBA'
  340. 'VK2020_DNK_Sealand_VA'
  341. 'VK2020_England_Dorset_VA'
  342. 'VK2020_England_Oxford_VA'
  343. 'VK2020_EST_Saaremaa_EVA'
  344. 'VK2020_Faroes_EM'
  345. 'VK2020_Faroes_VA'
  346. 'VK2020_GreenlandE_VA'
  347. 'VK2020_GreenlandW_VA'
  348. 'VK2020_IRL_Dublin_VA'
  349. 'VK2020_IRL_Eyrephort_VA'
  350. 'VK2020_ISL_Hofstadir_VA'
  351. 'VK2020_ISL_Hringsdalur_VA'
  352. 'VK2020_ISL_Ingiridarstadir_VA'
  353. 'VK2020_Isle_Of_Man_VA'
  354. 'VK2020_ITA_Foggia_MA'
  355. 'VK2020_NOR_Mid_IA'
  356. 'VK2020_NOR_Mid_MA'
  357. 'VK2020_NOR_Mid_VA'
  358. 'VK2020_NOR_North_IA'
  359. 'VK2020_NOR_North_LN_HG'
  360. 'VK2020_NOR_North_VA'
  361. 'VK2020_NOR_North_VA_o1'
  362. 'VK2020_NOR_North_VA_o2'
  363. 'VK2020_NOR_South_IA'
  364. 'VK2020_NOR_South_VA'
  365. 'VK2020_POL_Bodzia_VA'
  366. 'VK2020_POL_Cedynia_MA'
  367. 'VK2020_POL_Cedynia_VA'
  368. 'VK2020_POL_Krakow_MA'
  369. 'VK2020_POL_Sandomierz_VA'
  370. 'VK2020_RUS_Gnezdovo_VA'
  371. 'VK2020_RUS_Kurevanikha_VA'
  372. 'VK2020_RUS_Ladoga_VA'
  373. 'VK2020_RUS_Pskov_VA'
  374. 'VK2020_Scotland_Orkney_VA'
  375. 'VK2020_SWE_Gotland_VA'
  376. 'VK2020_SWE_Karda_VA'
  377. 'VK2020_SWE_Malmo_VA'
  378. 'VK2020_SWE_Oland_EVA'
  379. 'VK2020_SWE_Oland_IA'
  380. 'VK2020_SWE_Oland_VA'
  381. 'VK2020_SWE_Skara_VA'
  382. 'VK2020_SWE_Uppsala_VA'
  383. 'VK2020_UKR_Lutsk_MA'
  384. 'VK2020_UKR_Shestovitsa_VA'
  385. 'VK2020_Wales_Anglesey_VA'
  386. 'Wales_CA_EBA'
  387. 'Wales_IA'
  388. 'Wales_LBA'
  389. 'Wales_MBA'
  390. 'Wales_Meso'
  391. 'Wales_MIA'
  392. 'Wales_MIA_LIA'
  393. 'Wales_N'
  394. 'WHG'
  395. 'Yamnaya_BGR'
  396. 'Yamnaya_KAZ_Karagash'
  397. 'Yamnaya_KAZ_Mereke'
  398. 'Yamnaya_RUS_Caucasus'
  399. 'Yamnaya_RUS_Kalmykia'
  400. 'Yamnaya_RUS_Samara'
  401. 'Yamnaya_UKR'
  1. 'Albanian'
  2. 'Armenian_Erzurum'
  3. 'Armenian_Syunik'
  4. 'Armenian_Urfa'
  5. 'Ashkenazi_Belarussia'
  6. 'Ashkenazi_Germany'
  7. 'Ashkenazi_Lithuania'
  8. 'Ashkenazi_Poland'
  9. 'Ashkenazi_Russia'
  10. 'Ashkenazi_Ukraine'
  11. 'Austrian'
  12. 'Avar'
  13. 'Azerbaijani_Dagestan'
  14. 'Azerbaijani_Iran'
  15. 'Azerbaijani_Turkey'
  16. 'Bai'
  17. 'Bajo'
  18. 'Baka'
  19. 'Bakola'
  20. 'Balija'
  21. 'Balkar'
  22. 'Balochi'
  23. 'Baoan'
  24. 'Bashkir'
  25. 'Basque_French'
  26. 'Basque_Navarre_Center'
  27. 'Basque_Navarre_North'
  28. 'Basque_Soule'
  29. 'Basque_Spanish'
  30. 'BedouinA'
  31. 'BedouinB'
  32. 'Belarusian'
  33. 'BelgianA'
  34. 'BelgianB'
  35. 'BelgianC'
  36. 'Berber_Tunisia_Chen'
  37. 'Berber_Tunisia_Sen'
  38. 'Bosnian'
  39. 'Bukharian_Jew'
  40. 'Bulgarian'
  41. 'Chechen'
  42. 'Cherkes'
  43. 'Cossack_Kuban'
  44. 'Cossack_Ukrainian'
  45. 'Croatian'
  46. 'Cypriot'
  47. 'Czech'
  48. 'Danish'
  49. 'Druze'
  50. 'Dutch'
  51. 'Egyptian'
  52. 'EmiratiA'
  53. 'EmiratiB'
  54. 'EmiratiC'
  55. 'English'
  56. 'English_Cornwall'
  57. 'Erzya'
  58. 'Estonian'
  59. 'Finnish_Central'
  60. 'Finnish_East'
  61. 'Finnish_North'
  62. 'Finnish_Southeast'
  63. 'Finnish_Southwest'
  64. 'French_Alsace'
  65. 'French_Brittany'
  66. 'French_Nord'
  67. 'French_Seine-Maritime'
  68. 'French_South'
  69. 'Georgian_Ajar'
  70. 'Georgian_NorthEast'
  71. 'Georgian_West'
  72. 'German'
  73. 'German_East'
  74. 'German_Erlangen'
  75. 'German_Hamburg'
  76. 'Greek_Cappadocia'
  77. 'Greek_Central_Anatolia'
  78. 'Greek_Central_Macedonia'
  79. 'Greek_Corinthia'
  80. 'Greek_Crete'
  81. 'Greek_Crete_Chania'
  82. 'Greek_Crete_Heraklion'
  83. 'Greek_Crete_Lasithi'
  84. 'Greek_Cyclades_Amorgos'
  85. 'Greek_Cyclades_Kea'
  86. 'Greek_Cyclades_Milos'
  87. 'Greek_Cyclades_Tinos'
  88. 'Greek_Peloponnese'
  89. 'Greenlander_East'
  90. 'Greenlander_West'
  91. 'Hungarian'
  92. 'Icelandic'
  93. 'Ingrian'
  94. 'Ingushian'
  95. 'Iranian'
  96. 'Iranian_Fars'
  97. 'Iranian_Zoroastrian'
  98. 'Irish'
  99. 'Italian_Calabria'
  100. 'Italian_Campania'
  101. 'Italian_Northeast'
  102. 'Italian_Piedmont'
  103. 'Italian_Trentino_Alto_Adige'
  104. 'Italian_Tuscany'
  105. 'Italian_Umbria'
  106. 'Italian_Veneto'
  107. 'Jordanian'
  108. 'Karelian'
  109. 'Lak'
  110. 'Latvian'
  111. 'Lebanese_Christian'
  112. 'Lebanese_Druze'
  113. 'Lebanese_Muslim'
  114. 'Maltese'
  115. 'Moksha'
  116. 'Moldovan'
  117. 'Moroccan'
  118. 'Norwegian'
  119. 'Ossetian'
  120. 'Palestinian'
  121. 'Polish'
  122. 'Portuguese'
  123. 'Romanian'
  124. 'Russian_Leshukonsky'
  125. 'Russian_Orel'
  126. 'Russian_Pinega'
  127. 'Russian_Pinezhsky'
  128. 'Russian_Pskov'
  129. 'Saami'
  130. 'Saami_Kola'
  131. 'Sardinian'
  132. 'Saudi'
  133. 'Scottish'
  134. 'Sephardic_Jew'
  135. 'Serbian'
  136. 'Sicilian_East'
  137. 'Sicilian_West'
  138. 'Slovakian'
  139. 'Slovenian'
  140. 'Sorb_Niederlausitz'
  141. 'Spanish_Mallorca'
  142. 'Spanish_Menorca'
  143. 'Spanish_Murcia'
  144. 'Spanish_Navarra'
  145. 'Spanish_Terres_de_l\'Ebre'
  146. 'Spanish_Valencia'
  147. 'Swedish'
  148. 'Swiss_French'
  149. 'Swiss_German'
  150. 'Swiss_Italian'
  151. 'Syrian'
  152. 'Syrian_Jew'
  153. 'Tarkhan_Sikh/Hindu'
  154. 'Tu'
  155. 'Tubalar'
  156. 'Tujia'
  157. 'Tunisian'
  158. 'Tunisian_Berber_Matmata'
  159. 'Tunisian_Berber_Tamezret'
  160. 'Tunisian_Berber_Zraoua'
  161. 'Tunisian_Douz'
  162. 'Tunisian_Jew'
  163. 'Tunisian_Rbaya'
  164. 'Turkish_Rumeli'
  165. 'Turkish_Trabzon'
  166. 'Ukrainian_Chernihiv'
  167. 'Ukrainian_Dnipro'
  168. 'Ukrainian_Lviv'
  169. 'Ukrainian_Rivne'
  170. 'Ukrainian_Sumy'
  171. 'Ukrainian_Zakarpattia'
  172. 'Ukrainian_Zhytomyr'
  173. 'Vepsian'
In [18]:
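# Drop the label column and keep the population names as row names
# (tibbles only tolerate row names, hence the deprecation warnings below)
myavgs = avgs %>% select(-V1)
row.names(myavgs) = avgs$V1
mymodavgs = modavgs %>% select(-V1)
row.names(mymodavgs) = modavgs$V1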
Warning message:
“Setting row names on a tibble is deprecated.”
Warning message:
“Setting row names on a tibble is deprecated.”

Combine the ancient and modern averages into a single table

In [19]:
allavgs = bind_rows(myavgs,mymodavgs)

Rank the nearest neighbors of every population (one row each)

In [27]:
# k = n - 1 ranks every other population for each row, nearest first
mynn = kNN(allavgs,k = nrow(allavgs)-1)
mynn
k-nearest neighbors for 995 objects (k=994).
Available fields: dist, id, k, sort
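
The `id` and `dist` fields listed above are matrices with one row per input point and one column per neighbor, ordered from nearest to farthest. A minimal sketch on made-up coordinates (the `toy` matrix and its names are hypothetical, not G25 data) shows the layout:

```r
library(dbscan)

# Toy 2-D coordinates; names and values are made up for illustration only
toy <- matrix(c(0, 0,
                1, 0,
                0, 3,
                5, 5),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("A", "B", "C", "D"), c("PC1", "PC2")))

toy_nn <- kNN(toy, k = 2)

toy_nn$id    # row numbers of the 2 nearest neighbors of each point
toy_nn$dist  # matching Euclidean distances, nearest first
```

Row 1 of `toy_nn$id` refers to rows 2 and 3 of `toy`, which is why the notebook needs a lookup from row number back to population name in the following cells.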
In [28]:
# Keep only the neighbor index matrix; column `0` holds each population's own row number
mynn = rownames_to_column(as.data.frame(mynn$id)) %>%
        as_tibble() %>%
        mutate(`0` = row_number(),.after = rowname)
In [29]:
mynn %>% sample_n(10)
A tibble: 10 × 996
rowname                                0     1     2     3     4     5     6     7     8     ⋯   985   986   987   988   989   990   991   992   993   994
<chr>                              <int> <int> <int> <int> <int> <int> <int> <int> <int>     ⋯ <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
VK2020_GreenlandE_VA                 767   765   940   914   870   802   772   771   758     ⋯   978   559   656   598   554   586   236   237   841   840
VK2020_SWE_Karda_VA                  797    19   756   887   168    23   163   175   305     ⋯   978   559   656   598   554   586   236   237   841   840
Iberia_East_IA                       332   345   342   849   341   851   848   349   890     ⋯   559   656   653   554   598   586   236   237   841   840
Sarmatian_KAZ_Bisoba                 661   666   663   659   665   667   310   660   662     ⋯   638   598   656   554   586   978   236   237   841   840
RUS_Sidelkino_HG                     625   648   582   649   622   650   780   583   749     ⋯   571   838   564   656   236   554   978   237   841   840
BGR_EBA                               45   517   276   424    30    42   461    46   411     ⋯   559   653   656   554   598   586   236   237   841   840
Spanish_Menorca                      964   352   967   963   968   965   944   925   415     ⋯   559   653   656   554   598   586   236   237   841   840
Swiss_German                         971   888   886   857   856   447   408   305   351     ⋯   638   559   656   598   554   586   236   237   841   840
HUN_North_Transdanubia               310   661   666   660   663   667   665   659   658     ⋯   638   598   656   978   554   586   236   237   841   840
HUN_Avar_Middle-Late_Danube-Tisza    273   268   594   285   263   563   645   272   643     ⋯   137   468   811   174   219   512   384   429   841   840
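
The table above keeps only neighbor row numbers, so the actual distances in `mynn$dist` are not carried forward. If metric distances are wanted next to the ranks, one possible sketch (toy data and all `toy_*` names are hypothetical) flattens both matrices into a single edge table:

```r
library(dbscan)
library(tibble)
library(dplyr)

# Made-up coordinates standing in for G25 rows
toy <- matrix(c(0, 0,  1, 0,  0, 3,  5, 5), ncol = 2, byrow = TRUE,
              dimnames = list(c("A", "B", "C", "D"), c("PC1", "PC2")))
k <- 2
toy_nn <- kNN(toy, k = k)
pops <- rownames(toy)

# as.vector() walks a matrix column by column: all rank-1 neighbors first,
# then all rank-2 neighbors, so target and rank are repeated to match
toy_edges <- tibble(
  target     = rep(pops, times = k),
  population = pops[as.vector(toy_nn$id)],
  rank       = rep(seq_len(k), each = nrow(toy)),
  distance   = as.vector(toy_nn$dist)
) %>%
  arrange(target, rank)

toy_edges
```

This avoids the wide intermediate table entirely; whether that is preferable to the pivot used in this notebook depends on how many neighbors are kept.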

Pivot to long format and keep the five nearest neighbors of each population

In [30]:
# Lookup table: population name for each row number
mynames = mynn %>% select(rowname) %>% mutate(val = row_number())
In [31]:
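# One row per (population, neighbor rank); then swap neighbor row numbers for names
mynn = mynn %>%
        pivot_longer(cols = matches("^\\d"),names_to = "rank",values_to = "val") %>%
        inner_join(mynames,by = "val") %>%
        select(c(1,4,2)) # rowname.x (target), rowname.y (neighbor), rank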
In [32]:
colnames(mynn) = c("target","population","distance")
# `distance` holds the neighbor rank (1 = nearest), not a metric distance; keep the top 5
mynn = mynn %>% dplyr::filter(distance %in% 1:5)
In [33]:
mynn %>% sample_n(10)
A tibble: 10 × 3
target             population             distance
<chr>              <chr>                  <chr>
Turkish_Trabzon    Greek_Cappadocia       4
IRN_Hajji_Firuz_C  Turkish_Trabzon        4
BelgianB           French_Nord            1
Danish             VK2020_DNK_Sealand_VA  3
BelgianC           Swiss_German           4
Swedish            VK2020_SWE_Skara_VA    4
Russian_Pskov      DEU_MA_Krakauer_Berg   3
HUN_Starcevo_N     HUN_LBK_MN             3
Moksha             Ingrian                3
HRV_Pop_RomanP     SVK_LIA                5
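
The `<chr>` type of the last column arises because the ranks come from column names. A small sketch of the same pivot-and-join on a toy index table (all `toy_*` names and values are hypothetical) converts the rank to an integer and names it `rank` to avoid confusion with a metric distance:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy neighbor-index table: columns "1" and "2" hold the row numbers of the
# nearest and second-nearest neighbor (made-up values)
toy_ids <- tibble(
  rowname = c("A", "B", "C"),
  `1` = c(2L, 1L, 1L),
  `2` = c(3L, 3L, 2L)
)

# Lookup table from row number to population name
toy_names <- toy_ids %>% transmute(population = rowname, val = row_number())

toy_long <- toy_ids %>%
  pivot_longer(cols = matches("^\\d"), names_to = "rank", values_to = "val") %>%
  mutate(rank = as.integer(rank)) %>%        # column names arrive as character
  inner_join(toy_names, by = "val") %>%      # row number -> neighbor name
  select(target = rowname, population, rank)

toy_long
```

Naming the lookup column `population` up front also removes the need for the positional `select(c(1,4,2))`.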
In [34]:
# Despite the "top_10" file name, this table holds the 5 nearest neighbors per population
write_csv(mynn,"../Genetics/G25/Data/CSV/top_10_dist_all.csv")
In [35]:
# Ukrainian populations whose name does not end in "o" (this leaves out Ukrainian_Dnipro)
mynn %>% dplyr::filter(str_detect(target,"^Ukrainian.+[^o]$"))
A tibble: 30 × 3
target                 population             distance
<chr>                  <chr>                  <chr>
Ukrainian_Chernihiv    Ukrainian_Rivne        1
Ukrainian_Chernihiv    Ukrainian_Dnipro       2
Ukrainian_Chernihiv    Ukrainian_Sumy         3
Ukrainian_Chernihiv    Ukrainian_Zhytomyr     4
Ukrainian_Chernihiv    Russian_Orel           5
Ukrainian_Lviv         Ukrainian_Zakarpattia  1
Ukrainian_Lviv         Polish                 2
Ukrainian_Lviv         Ukrainian_Sumy         3
Ukrainian_Lviv         Ukrainian_Zhytomyr     4
Ukrainian_Lviv         Ukrainian_Rivne        5
Ukrainian_Rivne        Ukrainian_Chernihiv    1
Ukrainian_Rivne        Ukrainian_Zhytomyr     2
Ukrainian_Rivne        Ukrainian_Dnipro       3
Ukrainian_Rivne        Polish                 4
Ukrainian_Rivne        DEU_MA_Krakauer_Berg   5
Ukrainian_Sumy         Ukrainian_Chernihiv    1
Ukrainian_Sumy         Russian_Orel           2
Ukrainian_Sumy         Ukrainian_Rivne        3
Ukrainian_Sumy         Ukrainian_Zhytomyr     4
Ukrainian_Sumy         Ukrainian_Dnipro       5
Ukrainian_Zakarpattia  Ukrainian_Lviv         1
Ukrainian_Zakarpattia  Croatian               2
Ukrainian_Zakarpattia  Hungarian              3
Ukrainian_Zakarpattia  Slovakian              4
Ukrainian_Zakarpattia  Slovenian              5
Ukrainian_Zhytomyr     Ukrainian_Rivne        1
Ukrainian_Zhytomyr     Ukrainian_Chernihiv    2
Ukrainian_Zhytomyr     Ukrainian_Sumy         3
Ukrainian_Zhytomyr     Polish                 4
Ukrainian_Zhytomyr     Ukrainian_Dnipro       5
In [36]:
# Standardize the ranks 1..5 (mean 0, sd 1), so ranks 1 and 2 become negative
mynn$distance = as.numeric(scale(as.numeric(mynn$distance)))
In [37]:
myg = graph_from_data_frame(mynn)
In [38]:
myg
IGRAPH b80b3be DN-- 995 4975 -- 
+ attr: name (v/c), distance (e/n)
+ edges from b80b3be (vertex names):
 [1] ARM_Areni_C     ->Levant_Megiddo_MLBA_o1
 [2] ARM_Areni_C     ->Armenian_Syunik       
 [3] ARM_Areni_C     ->ARM_LBA               
 [4] ARM_Areni_C     ->Ostrogothic_Crimea_ACD
 [5] ARM_Areni_C     ->Greek_Cappadocia      
 [6] ARM_LBA         ->ARM_MBA               
 [7] ARM_LBA         ->ARM_Lchashen_MBA      
 [8] ARM_LBA         ->RUS_Alan_MA           
+ ... omitted several edges
In [39]:
# Authority score per population, using the standardized rank as edge weight
myg = as_tbl_graph(myg) %>% mutate(c_auth = centrality_authority(weights = distance))
myg
# A tbl_graph: 995 nodes and 4975 edges
#
# A directed simple graph with 1 component
#
# A tibble: 995 × 2
  name               c_auth
  <chr>               <dbl>
1 ARM_Areni_C      0       
2 ARM_LBA          1.16e- 9
3 ARM_Lchashen_MBA 0       
4 ARM_MBA          0       
5 AUT_IA_La_Tene   7.69e- 6
6 AUT_LBK_N        7.31e-11
# ℹ 989 more rows
#
# A tibble: 4,975 × 3
   from    to distance
  <int> <int>    <dbl>
1     1   502   -1.41 
2     1   825   -0.707
3     1     2    0    
# ℹ 4,972 more rows
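
Note that the standardized ranks give the two nearest neighbors negative edge weights and the closest neighbor the smallest weight. Hub and authority scores are normally interpreted with non-negative weights where a larger weight means a stronger tie, so it may be worth comparing against a variant like the sketch below. This is an alternative under that assumption, not the method used in this notebook; the toy edge list and the `closeness_w` column are hypothetical.

```r
library(dplyr)
library(tidygraph)

# Toy edge list: `rank` is the neighbor rank (1 = nearest); values are made up
toy_edges <- tibble::tibble(
  from = c("A", "A", "B", "B", "C", "C", "D", "D"),
  to   = c("B", "C", "A", "C", "B", "A", "C", "B"),
  rank = c(1, 2, 1, 2, 1, 2, 1, 2)
)

toy_g <- as_tbl_graph(toy_edges, directed = TRUE) %>%
  activate(edges) %>%
  # Flip the ranks so the nearest neighbor gets the largest positive weight
  mutate(closeness_w = max(rank) + 1 - rank) %>%
  activate(nodes) %>%
  mutate(c_auth = centrality_authority(weights = closeness_w))

# Node table with the authority score; D is never anyone's neighbor here
toy_scores <- toy_g %>% as_tibble()
toy_scores
```

With the real data, the same idea would mean replacing the `scale()` step with a flipped rank such as `6 - rank` before building the graph.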
In [40]:
myg %>% select(name,c_auth) %>% as.data.frame() %>% arrange(desc(c_auth))
A data.frame: 995 × 2
name c_auth
<chr> <dbl>
England_CA_EBA 1.000000000
Bell_Beaker_England 0.173952611
England_LIA 0.116812306
England_MBA 0.109148862
Corded_Ware_CHE 0.089944309
ISL_Viking_Age_Pre_Christian 0.081853731
Bell_Beaker_NLD 0.051372059
Scottish 0.051015268
SWE_Ollsjo_BA 0.049313911
POL_Chlopice_Vesele_Culture 0.046520461
CZE_Unetice_EBA 0.033693051
Bell_Beaker_CZE_late 0.029849571
England_EastYorkshire_LIA 0.029163725
Scotland_LIA 0.028726403
CZE_Unetice_preC 0.027456141
Scotland_LBA 0.025179970
England_MIA_LIA 0.024486887
England_LBA 0.024417841
VK2020_DNK_Sealand_VA 0.023105356
VK2020_SWE_Skara_VA 0.021617028
England_EIA 0.021563486
CZE_Bilina_BA 0.016005905
VK2020_NOR_North_VA 0.014743177
VK2020_IRL_Eyrephort_VA 0.010155855
Bell_Beaker_Bavaria 0.009698243
Scotland_EIA 0.009426029
English 0.008341500
England_Saxon 0.007196527
SWE_LN 0.006523234
VK2020_EST_Saaremaa_EVA 0.006340086
⋮⋮
Lebanese_Muslim 0
Maltese 0
Norwegian 0
Ossetian 0
Palestinian 0
Portuguese 0
Russian_Pinega 0
Russian_Pinezhsky 0
Russian_Pskov 0
Saami 0
Sardinian 0
Saudi 0
Sephardic_Jew 0
Slovenian 0
Sorb_Niederlausitz 0
Spanish_Murcia 0
Swedish 0
Swiss_French 0
Tarkhan_Sikh/Hindu 0
Tu 0
Tunisian 0
Tunisian_Berber_Matmata 0
Tunisian_Jew 0
Turkish_Rumeli 0
Turkish_Trabzon 0
Ukrainian_Chernihiv 0
Ukrainian_Dnipro 0
Ukrainian_Rivne 0
Ukrainian_Zhytomyr 0
Vepsian 0
In [41]:
options(repr.plot.width = 26,repr.plot.height = 30,repr.plot.res = 400,repr.plot.bg = "gray20")
In [42]:
#myp = myg %>% ggraph(layout = 'unrooted')#,circular = TRUE)
myp = myg %>% ggraph(layout = 'stress',circular = TRUE)

# Edge color shows the standardized neighbor rank; edge alpha shows direction
myp + geom_edge_fan(aes(color = distance,alpha = after_stat(index),
                        start_cap = label_rect(node1.name),end_cap = label_rect(node2.name)),
                    strength = 1.25,edge_width = 0.15) +
        geom_node_label(aes(label = name),label.size = 0,size = 1,alpha = 0.6,
                        fill = "gray15",color = "gray75") + #,repel = TRUE,max.overlaps = 10) +
        scale_edge_alpha('Edge direction') +
        scale_edge_color_gradient2("distance",low = "cyan",mid = "snow",high = "orange") +
        theme_void() +
        theme(plot.background = element_rect("gray20"),
              legend.title = element_text(color = "snow"),
              legend.text = element_text(color = "snow"),
              legend.position = "bottom") +
        guides(edge_alpha = guide_edge_direction()) +
        coord_flip()
Warning message:
“Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.”